Complex Rasters, Temperature NC Data
Today’s agenda includes:
nc filesWorking with lights was simple
In many other contexts, we can have more complex rasters
Temperature data from CEDA is an example of a raster that contains time series
Temperature in 2019
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download temperature data from by going to https://archive.ceda.ac.uk
You can download the CRU file (3.04 GB)
Place it in a folder called “temp”
Here is how we can read the file
This is the type of file we deal with
We then we extract the time component
Let’s print the first 10 entries
We now turn those numbers into Date objects
We can select all the dates from 1901 and 2022
We next need to identify the index of those dates
We can now extract those rasters from the spatraster b using those indices
Here is for example what the 1901 spatraster looks like:
class : SpatRaster
dimensions : 360, 720, 24 (nrow, ncol, nlyr)
resolution : 0.5, 0.5 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
sources : cru_ts4.07.1901.2022.tmp.dat.nc:tmp (12 layers)
cru_ts4.07.1901.2022.tmp.dat.nc:stn (12 layers)
varnames : tmp (near-surface temperature)
stn
names : tmp_1, tmp_2, tmp_3, tmp_4, tmp_5, tmp_6, ...
unit : degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, ...
time (days) : 1901-01-16 to 1901-12-16
We can then save the names of the rasters:
This is what we have covered so far:
library(terra)
library(sf)
library(stars)
#Step1: Read the raster
b <- rast("./datal17/temp/cru_ts4.07.1901.2022.tmp.dat.nc")
#Step2: Extract time information
time_values <- time(b)
#Step3: Turning strings into time
date_objects <- as.Date(time_values)
#Step4: Select only the dates from the year 1901 and 2022
dates_1901 <- unique(date_objects[format(date_objects, "%Y") == "1901"])
dates_2022 <- unique(date_objects[format(date_objects, "%Y") == "2022"])
#Step5: Find the index of the selected dates in the time_values
selected_indices_1901 <- which(time_values %in% as.Date(dates_1901))
selected_indices_2022 <- which(time_values %in% as.Date(dates_2022))
#Step6: Extract raster values for the selected dates
selected_rasters_1901 <- b[[selected_indices_1901]]
selected_rasters_2022 <- b[[selected_indices_2022]]
#Step7: Extract layer names
layer_names_1901 <- names(selected_rasters_1901)
layer_names_2022 <- names(selected_rasters_2022)We then extract the the index of the layers that start with tmp
We need to do this because there are other types of laters inside this .nc file: stn - number of stations contributing to each interpolation
We then select only the temperature rasters
Here is what tmp_layers_1901 looks like:
class : SpatRaster
dimensions : 360, 720, 12 (nrow, ncol, nlyr)
resolution : 0.5, 0.5 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
source : cru_ts4.07.1901.2022.tmp.dat.nc:tmp
varname : tmp (near-surface temperature)
names : tmp_1, tmp_2, tmp_3, tmp_4, tmp_5, tmp_6, ...
unit : degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, ...
time (days) : 1901-01-16 to 1901-12-16
We then do the same for 2022
Here is what tmp_layers_2022 looks like
class : SpatRaster
dimensions : 360, 720, 12 (nrow, ncol, nlyr)
resolution : 0.5, 0.5 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
source : cru_ts4.07.1901.2022.tmp.dat.nc:tmp
varname : tmp (near-surface temperature)
names : tmp_1, tmp_2, tmp_3, tmp_4, tmp_5, tmp_6, ...
unit : degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, ...
time (days) : 1901-01-16 to 1901-12-16
We can then rename the layers
We should rename those using:
Finally, this is how we rename them:
We then need to turn our spatraster object into a raster object
Notice the difference
lapply()lapply() is a fundamental function in R used to apply a given function to each element of a list or vector.
The name “lapply” stands for “list apply,” highlighting its primary usage with lists.
Syntax:
lapply(X, FUN, …)
lapply()This what that final dataframe looks like:
In R, it’s common to use anonymous or inline functions for concise code
These functions are defined on-the-fly and typically used when the function is simple and doesn’t require reusability.
Thus, we can write:
or
We then take the cell mean per raster for the whole year
Notice that the input is stars_list_1901
stars_list_1901 is a list of stars object containing 12 objects
You can see below the first object in the stars_list_1901
You can see below the second object in the stars_list_1901
Thus the following function takes the average temperature per cell the whole year
Also not that st_apply is a function similar to lapply, but only applies to sf objects.
We then combine the list of results into a single stars object
do.call() Function calls a function with a list of arguments
We then combine the list of results into a single stars object
These are the stape that we covered in part 2.
#Step8: Find indices of layers that start with "tmp"
#tmp stands for temperature. There are other rasters within these files.
indices_tmp_1901 <- grep("^tmp", layer_names_1901)
tmp_layers_1901 <- selected_rasters_1901[[indices_tmp_1901]]
indices_tmp_2022 <- grep("^tmp", layer_names_2022)
tmp_layers_2022 <- selected_rasters_2022[[indices_tmp_2022]]
#Step9: Changing raster names to temperature
names(tmp_layers_1901)<-dates_1901
names(tmp_layers_2022)<-dates_2022
#Step10: Turning the spatraster into raster
stars_list_1901 <- lapply(tmp_layers_1901, function(obj) st_as_stars(raster::raster(obj)))
stars_list_2022 <- lapply(tmp_layers_2022, function(obj) st_as_stars(raster::raster(obj)))
#Step11: Use lapply to apply st_apply to each stars object to calculate the mean
mean_list_1901 <- lapply(stars_list_1901, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))
mean_list_2022 <- lapply(stars_list_2022, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))
#Step12: Combine the list of results into a single stars object
mean_stars_1901 <- do.call(stars::st_as_stars, mean_list_1901)
mean_stars_2022 <- do.call(stars::st_as_stars, mean_list_2022)
#Step13: Changing the name of the rasters
names(mean_stars_1901)<-"avg_tmp_1901"
names(mean_stars_2022)<-"avg_tmp_2022"Temperature in 1901
Temperature in 1901
Temperature in 2022
Temperature in 2022
Now plotting was to some extent straightforward.
Now what if we wanted to extract the average temperature for every single year and make a time plot?
This is where our programming skills come in handy from earlier lectures
library(terra)
# Step 1: Read the raster
b <- rast("./datal17/temp/cru_ts4.07.1901.2022.tmp.dat.nc")
# Step 2: Extract time information
time_values <- time(b)
# Step 3: Turning strings into time
date_objects <- as.Date(time_values)
# Step 4: Get unique years
unique_years <- unique(format(date_objects, "%Y"))
#All years take too long, so we will only choose two years
unique_years <-c("2021", "2022")
# Step 5: Create an empty list to store mean stars for each year
mean_stars_list <- list()This is where our programming skills come in handy from earlier lectures
# Step 6: Iterate over each unique year
for (year in unique_years) {
cat("Calculating average for year", year, "\n")
# Select dates for the current year
selected_dates <- unique(date_objects[format(date_objects, "%Y") == year])
# Find the index of the selected dates in the time_values
selected_indices <- which(time_values %in% as.Date(selected_dates))
# Extract raster values for the selected dates
selected_rasters <- b[[selected_indices]]
# Extract layer names
layer_names <- names(selected_rasters)
# Find indices of layers that start with "tmp"
indices_tmp <- grep("^tmp", layer_names)
# Extract temperature layers
tmp_layers <- selected_rasters[[indices_tmp]]
# Change raster names to temperature
names(tmp_layers) <- selected_dates
# Convert spatraster into raster
stars_list <- lapply(tmp_layers, function(raster_obj) st_as_stars(raster::raster(raster_obj)))
# Use lapply to apply st_apply to each stars object to calculate the mean
mean_list <- lapply(stars_list, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))
# Combine the list of results into a single stars object
mean_stars <- do.call(stars::st_as_stars, mean_list)
# Change the name of the raster
names(mean_stars) <- paste("avg_tmp", year, sep = "_")
# Add the mean stars to the list
mean_stars_list[[year]] <- mean_stars
cat("Average calculation for year", year, "completed\n\n")
}Calculating average for year 2021
Average calculation for year 2021 completed
Calculating average for year 2022
Average calculation for year 2022 completed
This is where our programming skills come in handy from earlier lectures
# Combine all the mean stars into a single stars object
# Initialize an empty data frame
result_df <- data.frame()
# Loop through each year and calculate the mean
for (year in names(mean_stars_list)) {
#Creating an object where we look through the years
avg_tmp_col_name <- paste0("avg_tmp_", year)
#Calculating mean and keeping track of the year
mean_value <- mean(mean_stars_list[[year]][[avg_tmp_col_name]], na.rm = TRUE)
#Saving everything into a dataframe
result_df <- rbind(result_df, data.frame(year = as.numeric(year), average_temperature = mean_value))
}
# Writing the df to a dataframe
write.csv(result_df, "./datal17/output/average_temp_2021_2022.csv", row.names = FALSE)So, we wrote our dataframe to a csv file.
Note that we did this dataframe for two years only: 2021 and 2022.
To complete the loop for all years, put a # in front of
unique_years <-c("2021", "2022")
This means that unique_years is an object made out of
unique(format(date_objects, "%Y"))
In other words, all years from 1901 until 2022. But this takes a long time (i.e. a few hours): multiply how long it took for 2021 and 2022 by 50.
To save you time, I have already done all the years on my machine. You can download the csv file with all the years.
Put it in the relevant folder
#write.csv(result_df, "./datal17/output/average_temp_1901_2022.csv", row.names = FALSE)
#Reading the file
average_temp<-read.csv("./datal17/output/average_temp_1901_2022.csv")
#Subsetting to below 2001
average_temp_1901_2000<-subset(average_temp, year<2001)
#Calculating the 1901-2000 average
average_temp_1901_2000<-mean(average_temp_1901_2000$average_temperature)
#Calculation the deviations from the 1901-2000 average
average_temp$dev_1901_2000_avg<-average_temp$average_temperature-average_temp_1901_2000Creating a time trend
Creating a time trend
Creating a bar plot with the values
# Create a new column for bar color based on the condition
average_temp$bar_color <- ifelse(average_temp$dev_1901_2000_avg > 0, "red", "blue")
# Create the bar plot
ggplot(average_temp, aes(x = year, y = dev_1901_2000_avg, fill = bar_color)) +
geom_bar(stat = "identity", color = "black") +
scale_fill_manual(values = c("blue", "red"), guide="none") +
theme_bw() +
labs(title = "Time Trend of Average Temperature",
x = "Year",
y = "Difference from 1901-2000 avearge (degrees C)")Creating a bar plot with the values
This is very close to the official figures produced by NOAA (National Ceneter for Environmental Information)
We have covered a lot of operations related to rasters
We learned about:
Popescu (JCU): Lecture 17